Abstract

This paper examines the performance of decision feedback based iterative channel estimation and multiuser detection in channel coded aperiodic DS-CDMA systems operating over multipath fading channels. First, explicit expressions describing the performance of channel estimation and parallel interference cancellation based multiuser detection are developed. These results are then combined to characterize the evolution of the performance of a system that iterates among channel estimation, multiuser detection and channel decoding. Sufficient conditions for convergence of this system to a unique fixed point are developed.

I. Introduction 

Direct sequence code division multiple-access (DS-CDMA) has been selected as the fundamental signaling technique for third generation (3G) wireless communication systems, due to its advantages of soft user capacity limit and inherent frequency diversity. However, it suffers from multiple-access interference (MAI) caused by the non-orthogonality of spreading codes, particularly in heavily loaded systems. Therefore, techniques for mitigating the MAI, namely multiuser detection, have been the subject of an intensive research effort over the past two decades. It is well known that multiuser detection can substantially suppress MAI, thus improving system performance. Maximum likelihood (ML) multiuser detection [28] was proposed in the early 1980s, and achieves the optimal performance at the cost of prohibitive computational

Husheng Li is with Qualcomm Inc., San Diego, CA 92121, USA (email: hushengl@qualcomm.com). Sharon M. Betz is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544, USA (email: sbetz@princeton.edu). H. Vincent Poor is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544, USA (email: poor@princeton.edu). This research was supported by the Air Force Research Laboratory under Cooperative Agreement No. FA8750-05-2-0192.




cost when the number of users is large. For practical implementation, suboptimal algorithms, such as the linear minimum mean square error (LMMSE) detector [?] or the decorrelator [?], allow a tradeoff between complexity and performance. It should be noted that multiuser detection is being applied in existing CDMA systems, such as EV-DO Revision A systems [12].

In recent years, the turbo principle, namely the iterative exchange of soft information among different blocks in a communication system to improve the system performance, has been applied to combine multiuser detection with channel decoding [?]. In such turbo multiuser detectors, the outputs of channel decoders are fed back to the multiuser detector, thus enhancing the performance iteratively. Turbo multiuser detection based on the maximum a posteriori probability (MAP) detection and decoding criterion has been proposed in [30], [31], together with a lower complexity technique based on interference cancellation and LMMSE filtering. Further simplification is obtained by applying parallel interference cancellation (PIC) [?] for multiuser detection, where the decisions of the decoders are directly subtracted from the original signal to cancel the MAI.

Practical wireless communication systems usually experience fading channels, whose state information is unknown to the receiver. Thus practical systems need to consider detection and decoding with uncertain channel state information. In the context of short code CDMA systems, blind multiuser detection can be accomplished without explicit channel estimation by using subspace and other techniques [?]. An alternative receiver structure adopts an explicit channel estimation block and carries out the decoding with the corresponding channel estimate. In systems without decision feedback, the channel estimation block is cascaded with the decoder and operates as a front end for the subsequent blocks. With such a receiver structure, the channel estimates can be obtained with training symbols or with blind estimation algorithms. Explicit expressions for the performance of such channel estimation schemes are given in [11] and the corresponding impact on multiuser detection is discussed in the large system limit in [?] and [?]. In systems with decision feedback, the decisions of the decoder are fed back to the channel estimator to enhance its performance. In such systems, the channel estimator and the decoder can operate either simultaneously [25] or successively [13], [23]. An example of the former strategy, applied to ML sequence detection in uncertain environments, is proposed in [25]: in this scheme, called per-survivor processing, tentative decisions are immediately fed back to the channel estimation algorithm and the corresponding estimates are used for the detection of future symbols. In the latter strategy, the decisions are fed back only when the entire current decoding procedure is finished. For example, in [13], an expectation maximization (EM) channel estimation algorithm, combined with successive interference cancellation, is proposed. Joint channel estimation and data detection algorithms for uncoded single-antenna and multiple-antenna systems are discussed in [8] and [?], respectively. In channel coded systems, iteration can achieve better performance when the turbo principle is applied, due to the redundancy introduced by the code structure. In [23], an iterative algorithm is proposed and analyzed for channel estimation and decoding of low-density parity-check (LDPC) coded quadrature amplitude modulation (QAM) systems.

In this paper, we consider channel-coded CDMA systems operating over multipath fading channels whose 
channel state information is unknown to the receiver. To demodulate and decode such systems, we apply 
the turbo principle to both channel estimation and multiuser detection. As shown in Figure 1, we consider 
a receiver that feeds back decisions from channel decoders to both an ML channel estimator and a PIC 
multiuser detector. The iteration is initialized with training symbol based channel estimation and non-iterative multiuser detection. The receiver structure is similar to those proposed in [?]. However, this paper is focused mainly on the performance analysis of such structures using semi-analytic methods. We
analyze the contributions to the variance of the channel estimation error due to noise and decision feedback 
error, and the variance of the residual MAI after PIC. We then use this analysis to describe the decoding 
process as an iterative mapping. We also propose conditions assuring convergence of this iterative mapping 
to a unique fixed point. We further compute the asymptotic multiuser efficiency (AME) [29] of this overall
system, under some mild assumptions on the channel decoders. It should be noted that the analysis in this 
paper is based on large sample and large system analysis. 

The remainder of this paper is organized as follows. Section II introduces the signal model and the chan- 
nel decoder used in our analysis. The performance analyses of ML channel estimation and PIC multiuser 
detection are given in Section III and Section IV, respectively. Based on these results, the corresponding 
iterative mapping is described and analyzed in Section V. Numerical results and conclusions are given in 
Section VI and Section VII, respectively. The notations used in this paper are explained as follows. 

• Throughout this paper, if no special note is given, we denote vectors with small letters in bold fonts, 
matrices with capital letters in bold fonts and scalars with non-bold fonts. 

• For any variable $X$, we denote the corresponding estimate from the decision feedback by $\hat{X}$ and the corresponding error $X - \hat{X}$ by $\delta X$.

• Superscript $T$ denotes transposition and superscript $H$ denotes conjugate transposition.

• $\mathbf{I}$ denotes the identity matrix.

• $\lceil x \rceil$ denotes the smallest integer larger than or equal to $x$.

• $\mathrm{mod}(i, j)$ denotes $i$ modulo $j$, with the convention that $\mathrm{mod}(i, i) = i$.

• For a matrix $\mathbf{A}_{m \times n}$, $\|\mathbf{A}\| = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n} |A_{ij}|^2}$ denotes the Frobenius norm of $\mathbf{A}$.
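The ceiling and modulo conventions above are what map a single stacked index (for example a position in a channel vector of length $KL$) back to a user and a path. A minimal Python sketch of that bookkeeping, assuming 1-based indices; the function names `mod_paper` and `stacked_index_to_user_path` are illustrative and do not come from the paper:

```python
def mod_paper(i, j):
    """Modulo with the convention mod(i, i) = i, so results lie in 1..j."""
    r = i % j
    return j if r == 0 else r


def stacked_index_to_user_path(i, L):
    """Map a 1-based stacked index i to (user, path) = (ceil(i/L), mod(i, L))."""
    user = -(-i // L)  # integer ceiling of i / L
    return user, mod_paper(i, L)


if __name__ == "__main__":
    L = 3  # paths per user (illustrative value)
    for i in range(1, 7):
        print(i, stacked_index_to_user_path(i, L))
```

With $L = 3$, indices 1 to 3 belong to user 1 and indices 4 to 6 to user 2, which is the ordering assumed for the stacked channel vector.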

II. Signal Model 

A. Signal Model 

We consider a synchronous uplink long code (aperiodic) DS-CDMA system, with identical channel coding, binary phase-shift keying (BPSK) modulation, $K$ active users, spreading gain $N$, system load $\beta = \frac{K}{N}$, and identical transmission rates for all users. The transmitted symbols experience multipath fading. We adopt a block fading model and denote by $M$ the coherence time, measured in the number of symbol periods, over which the channel is stationary. Within a coherence period, the chip matched filter output of the receiver at symbol period $t$ can be collected into an $N$-vector given by

$$\mathbf{r}(t) = \sum_{k=1}^{K} b_k(t) \sum_{l=1}^{L} a_{kl}\,\mathbf{s}_{kl}(t) + \mathbf{n}(t), \quad t = 1, 2, \ldots, M, \tag{1}$$

where $L$ denotes the number of resolvable paths per user, $b_k(t)$ denotes the channel coded binary symbols, $a_{kl}$ denotes the channel gain of the $l$-th path of user $k$, $\mathbf{s}_{kl}(t)$ denotes the binary spreading code with $\|\mathbf{s}_{kl}(t)\| = 1$ received from user $k$ along path $l$ at time $t$, and $\mathbf{n}(t)$ is an $N$-vector of independent and identically distributed (i.i.d.) circularly symmetric complex Gaussian (CSCG) noise variables with (normalized) variance $\sigma_n^2$. It should be noted that although the assumption of synchronicity is valid in time division duplexing (TDD) systems, it does not hold for many frequency division duplexing (FDD) systems. However, as will be shown, the results from the analysis of synchronous systems are also reasonably valid, though not exactly the same, in the case of asynchronous systems.
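The received signal model can be illustrated with a short simulation. The sketch below generates one chip-rate vector as a sum over users and paths of symbol times gain times unit-norm binary code, plus CSCG noise of variance $\sigma_n^2$ per chip. It uses only the Python standard library; the helper names `random_code` and `received_vector` and all parameter values are assumptions made for illustration.

```python
import math
import random


def random_code(N, rng):
    """Binary spreading code of length N, scaled so that its norm is 1."""
    scale = 1.0 / math.sqrt(N)
    return [rng.choice((-scale, scale)) for _ in range(N)]


def received_vector(b, a, s, sigma2, rng):
    """One received N-vector.

    b[k]    : +/-1 symbol of user k
    a[k][l] : complex gain of path l of user k
    s[k][l] : unit-norm spreading code (list of N chips)
    sigma2  : variance of each complex noise sample
    """
    N = len(s[0][0])
    r = [0j] * N
    for k, b_k in enumerate(b):
        for l, a_kl in enumerate(a[k]):
            for n in range(N):
                r[n] += b_k * a_kl * s[k][l][n]
    # Real and imaginary parts each carry half of the noise variance.
    std = math.sqrt(sigma2 / 2.0)
    return [x + complex(rng.gauss(0.0, std), rng.gauss(0.0, std)) for x in r]


if __name__ == "__main__":
    rng = random.Random(0)
    K, L, N = 2, 2, 8  # illustrative sizes
    b = [rng.choice((-1, 1)) for _ in range(K)]
    a = [[complex(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(L)]
         for _ in range(K)]
    s = [[random_code(N, rng) for _ in range(L)] for _ in range(K)]
    print(received_vector(b, a, s, 0.1, rng))
```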
For the system model, we have the following assumptions. 

Assumption II.1: The channel gains $\{a_{kl}\}$ are independently CSCG distributed with zero means and variances $\frac{1}{L}$. We consider only the case of large $L$, which implies that $\sum_{l=1}^{L} |a_{kl}|^2 \approx 1$, $k = 1, \ldots, K$; thus all users achieve the same performance with maximal ratio combining (MRC).

A complex random variable is CSCG distributed if its real and imaginary parts are mutually independent Gaussian random variables with zero mean and identical variance.
Assumption II.2: We ignore intersymbol interference (ISI) and assume that the spreading codes received along different paths of a given user are mutually independent (independent model).

Assumption II.3: Based on Assumption II.2, the crosscorrelations $\rho_{klmn}(t) = \mathbf{s}_{kl}^H(t)\,\mathbf{s}_{mn}(t)$ (note that $\rho_{klkl}(t) = 1$) satisfy

• $E\{\rho_{klmn}(t)\} = 0$, if $(k, l) \neq (m, n)$;

• $E\{\rho_{klmn}^2(t)\} = \frac{1}{N}$, if $(k, l) \neq (m, n)$;

• $E\{\rho_{klmn}(t)\rho_{pqrs}(t)\} = 0$, if $(k, l, m, n) \neq (p, q, r, s)$.
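The first two crosscorrelation moments are easy to check empirically for independent random binary codes of unit norm. The following is a small Monte Carlo sketch, not part of the paper's analysis; the code model (i.i.d. chips equal to $\pm 1/\sqrt{N}$) and the function names are assumptions made for illustration.

```python
import math
import random


def random_code(N, rng):
    """Binary spreading code of length N with unit norm."""
    scale = 1.0 / math.sqrt(N)
    return [rng.choice((-scale, scale)) for _ in range(N)]


def crosscorrelation(s1, s2):
    """Inner product of two real-valued spreading codes."""
    return sum(x * y for x, y in zip(s1, s2))


def crosscorrelation_moments(N, trials, rng):
    """Empirical mean and second moment of the crosscorrelation of two
    independently drawn codes."""
    total = 0.0
    total_sq = 0.0
    for _ in range(trials):
        rho = crosscorrelation(random_code(N, rng), random_code(N, rng))
        total += rho
        total_sq += rho * rho
    return total / trials, total_sq / trials


if __name__ == "__main__":
    N = 32
    mean, second = crosscorrelation_moments(N, 20000, random.Random(0))
    print("mean:", mean, "second moment:", second, "1/N:", 1.0 / N)
```

Under this code model the empirical mean should be close to 0 and the empirical second moment close to $1/N$.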

The above assumptions simplify the performance analysis substantially. Moreover, these assumptions are 
reasonable for practical systems due to the following reasons: 

• Assumption II.1 is based on the fact that more propagation paths are resolvable in CDMA systems than in narrowband systems, particularly in environments with abundant scattering (e.g., indoor environments). With this assumption, we ignore the impact of the fluctuation of received power incurred by the multipath fading, and consider only the impairment caused by the channel estimation error.

• Assumption II.2 is unrealistic since these sequences are shifted versions of each other (shifted model). However, the accuracy of the results that depend on this assumption is validated with numerical results in Section VI and the asymptotic analysis given in Appendix I.

B. Receiver Structure 

The structure of the receiver is shown in Figure 1. The channel coefficients are estimated in the channel estimator, which operates in a 'semi-blind' way. Training symbols are available to obtain an initial estimate in the first iteration. In further iterations, the information symbol decisions from the channel decoders are assumed to be correct; both the training symbols and the fed back decisions are then treated as training symbols and used for ML channel estimation. A multiuser detector is used to mitigate the MAI and its outputs are de-interleaved and decoded in the channel decoder. In the multiuser detector, we use the LMMSE algorithm in the first iteration and the PIC algorithm with the aid of hard decision feedback in the succeeding iterations. We follow the standard procedure in turbo multiuser detection [11], [13], [22], [30] to reconstruct the channel symbols from the channel decoder output. These channel symbol estimates are then interleaved and fed back to the multiuser detector and channel estimator to enhance the performance iteratively.

We denote by $\hat{b}_k(t)$ the estimated binary channel symbol of user $k$ at symbol period $t$ that is fed back from the channel decoder. For simplicity, we use hard decision feedback and denote the feedback symbol error rate by $P_e$. The decision feedback error is denoted by $\delta b_k(t) = b_k(t) - \hat{b}_k(t)$. Supposing that both $b_k(t)$ and $\delta b_k(t)$ are symmetrically distributed, it is easy to check that

• $E\{\delta b_k(t)\} = 0$;

• $E\{b_k(t)\,\delta b_k(t)\} = 2P_e$;

• $E\{\delta b_k^2(t)\} = 4P_e$;

• $E\{\delta b_k(m)\,\delta b_l(n)\} = 0$, when $(k, m) \neq (l, n)$.
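These moments follow from the fact that $\delta b_k(t)$ equals $0$ with probability $1 - P_e$ and $2b_k(t)$ with probability $P_e$. A short Monte Carlo sketch, assuming equiprobable BPSK symbols and independent hard-decision errors (the function name is illustrative), confirms the first three of them numerically:

```python
import random


def feedback_error_moments(pe, trials, rng):
    """Empirical E{db}, E{b*db} and E{db^2} for hard decision feedback."""
    sum_d = 0.0
    sum_bd = 0.0
    sum_d2 = 0.0
    for _ in range(trials):
        b = rng.choice((-1, 1))                # transmitted BPSK symbol
        b_hat = -b if rng.random() < pe else b  # decision, wrong w.p. pe
        d = b - b_hat                           # decision feedback error
        sum_d += d
        sum_bd += b * d
        sum_d2 += d * d
    return sum_d / trials, sum_bd / trials, sum_d2 / trials


if __name__ == "__main__":
    pe = 0.1
    m1, m2, m3 = feedback_error_moments(pe, 100000, random.Random(0))
    print("E{db}   =", m1, "(expected 0)")
    print("E{b db} =", m2, "(expected", 2 * pe, ")")
    print("E{db^2} =", m3, "(expected", 4 * pe, ")")
```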

It should be noted that, in practical systems, soft decision feedback will achieve better performance than hard decision feedback. However, the performance of channel estimation with soft decision feedback is determined by both the first and second moments of the decision feedback error [11]. Thus the corresponding analysis of performance evolution is more complicated than in the case of hard decision feedback. Therefore, we adopt hard decision feedback in order to simplify the system performance analysis.

For the decision feedback from channel decoders, we have the following reasonable assumption, which simplifies the analysis and is also used in [11].

Assumption II.4: The codeword length is assumed to be large enough so that the transmitted symbols are coded over many coherence periods. The decision feedbacks are mutually independent for different $k$ or $t$.

III. Performance Analysis of Channel Estimation 

In this section, we discuss the performance of channel estimation. First, we explain the training symbol 
based ML channel estimation algorithm that is used in the first iteration. Then, we consider the estimation 
of the channel coefficients with only hard decision feedback from the channel decoders. Finally, we extend 
the performance results to channel estimation with both training symbols and decision feedback, the latter 
of which is used in the further iterations. 

In applying the turbo principle, to avoid the reuse of information, only observations $\{\mathbf{r}(t)\}_{t \neq i}$ are used in the channel estimation for multiuser detection in symbol period $i$. Thus the corresponding channel estimation error is independent of $\mathbf{r}(i)$. However, for simplicity of discussion, we still assume that all $M$ received signals are used for the channel estimation while retaining this independence assumption. For large $M$, this results in only a small error in the analysis.

In the following discussion of channel estimation and PIC, we regard the channel gains $\{a_{kl}\}$ and the spreading codes $\{\mathbf{s}_{kl}\}$ as fixed realizations of random variables. Only the transmitted symbols, decision feedback errors and noise are considered as random variables. Throughout this paper, all expectations, denoted as $E\{\cdot\}$, are over the distributions of these three variables. Thus our results are conditioned on the realizations of $\{a_{kl}\}$ and $\{\mathbf{s}_{kl}\}$. However, by the strong law of large numbers, we will see that we can obtain identical results for almost every realization of $\{a_{kl}\}$ and $\{\mathbf{s}_{kl}\}$ in the large system limit ($K, N \to \infty$).

A. Training Symbol Based ML Channel Estimation 

First we assume that there are M training symbols, channel symbols known to the receiver, within a single 
coherence period. For simplicity in deriving the channel estimate, we stack the chip matched filter output of 
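the signal corresponding to these training symbols, rewriting (1) as

$$\mathbf{r} = \mathbf{S}\mathbf{a} + \mathbf{n}, \tag{2}$$

where

$$\begin{aligned}
\mathbf{r} &= \left(\mathbf{r}^T(1), \ldots, \mathbf{r}^T(M)\right)^T,\\
\mathbf{n} &= \left(\mathbf{n}^T(1), \ldots, \mathbf{n}^T(M)\right)^T,\\
\mathbf{a} &= \left(a_{11}, a_{12}, \ldots, a_{KL}\right)^T,\\
\mathbf{S} &= \left((\mathbf{S}(1)\mathbf{B}(1))^T, \ldots, (\mathbf{S}(M)\mathbf{B}(M))^T\right)^T,\\
\mathbf{B}(m) &= \mathrm{diag}\left(b_1(m)\mathbf{I}_{L \times L},\ b_2(m)\mathbf{I}_{L \times L},\ \ldots,\ b_K(m)\mathbf{I}_{L \times L}\right)_{KL \times KL},\\
\mathbf{S}(m) &= \left(\mathbf{s}_{11}(m), \mathbf{s}_{12}(m), \ldots, \mathbf{s}_{KL}(m)\right), \quad m = 1, \ldots, M.
\end{aligned}$$

Applying the ML criterion and the normality of the noise, we can obtain the ML channel estimate, which is given by

$$\hat{\mathbf{a}} = \arg\max_{\mathbf{a}} P(\mathbf{r} \mid \mathbf{a}) = \arg\min_{\mathbf{a}} \|\mathbf{r} - \mathbf{S}\mathbf{a}\|^2 = (\mathbf{S}^H\mathbf{S})^{-1}\mathbf{S}^H\mathbf{r} = \mathbf{R}^{-1}\mathbf{y}, \tag{3}$$

where $\mathbf{R} = \mathbf{S}^H\mathbf{S}$ and $\mathbf{y} = \mathbf{S}^H\mathbf{r}$.

It follows directly that the channel estimation error is

$$\delta\mathbf{a} = \mathbf{a} - \hat{\mathbf{a}} = -\mathbf{R}^{-1}\mathbf{S}^H\mathbf{n},$$

from which it is obvious that this error has zero mean and covariance $\boldsymbol{\Sigma}_a = E\{\delta\mathbf{a}\,\delta\mathbf{a}^H\} = \sigma_n^2\mathbf{R}^{-1}$.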

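For a finite $M$, we can compute $\mathrm{trace}\{\mathbf{R}^{-1}\}$ in the large system limit (i.e. when $K, N \to \infty$ while keeping the system load, $\frac{K}{N} = \beta$, constant). For a system with system load $\beta$, it is well known that as $K \to \infty$, $\frac{1}{K}\mathrm{trace}\{\mathbf{R}^{-1}\}$ converges to the inverse of the multiuser efficiency of a decorrelator, namely $1 - \beta$ [?]. Here $\frac{1}{M}\mathbf{R}$ is equivalent to the covariance matrix of a system with equivalent system load $\beta' = \frac{KL}{MN} = \frac{L\beta}{M}$. Thus as $K, N \to \infty$, we have

$$\frac{1}{KL}\mathrm{trace}\{\boldsymbol{\Sigma}_a\} \to \frac{\sigma_n^2}{M} \cdot \frac{1}{1 - \frac{L\beta}{M}} = \frac{\sigma_n^2}{M - L\beta}.$$

Therefore, for sufficiently large $K$ and $N$, the variance of channel estimation error is given by

$$\frac{\sigma_n^2}{M - L\beta}, \tag{4}$$

which can be approximated by $\frac{\sigma_n^2}{M}$ when $M$ is sufficiently large.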
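To get a feel for how quickly the large-$M$ approximation becomes accurate, the following sketch evaluates the training-based error variance, assuming the large-system expression $\sigma_n^2/(M - L\beta)$ and its approximation $\sigma_n^2/M$. The function names and the parameter values are illustrative choices, not taken from the paper.

```python
def training_error_variance(sigma2, M, L, beta):
    """Large-system per-path error variance sigma_n^2 / (M - L*beta)."""
    if M <= L * beta:
        raise ValueError("the expression requires M > L * beta")
    return sigma2 / (M - L * beta)


def training_error_variance_approx(sigma2, M):
    """Large-M approximation sigma_n^2 / M."""
    return sigma2 / M


if __name__ == "__main__":
    sigma2, L, beta = 0.1, 4, 0.5  # illustrative values
    for M in (5, 10, 50, 200):
        exact = training_error_variance(sigma2, M, L, beta)
        approx = training_error_variance_approx(sigma2, M)
        print(M, exact, approx, (exact - approx) / exact)
```

The relative gap between the two expressions equals $L\beta/M$, so the approximation is already within a few percent once $M$ is an order of magnitude larger than $L\beta$.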

It should be noted that, in asynchronous systems, we can remove part of the chips in the first and the last symbol periods to obtain a similar matrix $\mathbf{S}_{(NM - d_{\max}) \times KL}$, where $d_{\max}$ denotes the largest time offset among the users, measured in chips. Since the training symbols have been incorporated into the spreading codes, we can consider the columns of $\mathbf{S}$ as random $(NM - d_{\max})$-vectors, regardless of the time offsets of different users. Therefore, the variance of channel estimation error in asynchronous systems is similar to that of synchronous systems when $M$ is sufficiently large.

B. Channel Estimation with Decision Feedback 

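1) Algorithm: When decision feedback is used in place of training symbols to derive the 'ML' channel estimates², a process that assumes that the decision feedback is free of error, the channel estimation error is caused by both the thermal noise and the decision feedback error. On applying (3), the channel estimate with decision feedback is given by

$$\hat{\mathbf{a}} = \hat{\mathbf{R}}^{-1}\hat{\mathbf{y}} = \hat{\mathbf{R}}^{-1}\hat{\mathbf{S}}^H(\mathbf{S}\mathbf{a} + \mathbf{n}) = \mathbf{a} + \hat{\mathbf{R}}^{-1}\hat{\mathbf{S}}^H(\delta\mathbf{S}\,\mathbf{a} + \mathbf{n}),$$

where $\delta\mathbf{S} = \mathbf{S} - \hat{\mathbf{S}}$, $\hat{\mathbf{R}} = \hat{\mathbf{S}}^H\hat{\mathbf{S}}$, $\hat{\mathbf{y}} = \hat{\mathbf{S}}^H\mathbf{r}$ and $\hat{\mathbf{S}}$ is the version of $\mathbf{S}$ in (2) obtained from the decision feedback, which is given by

$$\hat{\mathbf{S}} = \left((\mathbf{S}(1)\hat{\mathbf{B}}(1))^T, \ldots, (\mathbf{S}(M)\hat{\mathbf{B}}(M))^T\right)^T, \qquad \hat{\mathbf{B}}(m) = \mathrm{diag}\left(\hat{b}_1(m)\mathbf{I}_{L \times L},\ \hat{b}_2(m)\mathbf{I}_{L \times L},\ \ldots,\ \hat{b}_K(m)\mathbf{I}_{L \times L}\right)_{KL \times KL}.$$

Hence, the channel estimation error can be decomposed into two parts

$$\delta\mathbf{a} = -\hat{\mathbf{R}}^{-1}\hat{\mathbf{S}}^H(\delta\mathbf{S}\,\mathbf{a} + \mathbf{n}) = \delta\mathbf{a}_f + \delta\mathbf{a}_n, \tag{5}$$

where $\delta\mathbf{a}_f = -\hat{\mathbf{R}}^{-1}\hat{\mathbf{S}}^H\delta\mathbf{S}\,\mathbf{a}$ and $\delta\mathbf{a}_n = -\hat{\mathbf{R}}^{-1}\hat{\mathbf{S}}^H\mathbf{n}$ denote the channel estimation error due to the decision feedback error and the thermal noise, respectively. It is reasonable to assume that $\delta\mathbf{a}_f$ and $\delta\mathbf{a}_n$ are mutually independent. (Recall our assumption concerning the use of only measurements $t \neq i$ in estimating gains at time $i$.)

²By 'ML' estimates, we mean using the expression obtained from the training symbol based estimation, but with symbols obtained from decision feedback. It is not an exact ML estimate since the distribution of the decision feedback error is not considered.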

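It is difficult to tackle the calculation of $\delta\mathbf{a}$ due to the matrix inversion $\hat{\mathbf{R}}^{-1}$. However, we can approximate $\hat{\mathbf{R}}^{-1}$ by $\frac{1}{M}\mathbf{I}_{KL \times KL}$ when $M$ is sufficiently large and $P_e$ is sufficiently small. This approximation is justified by the following lemma.

Lemma III.1: When fixing $K$ and $N$, we have

$$M\hat{\mathbf{R}}^{-1} \to \mathbf{I}_{KL \times KL},$$

almost surely³ as $M \to \infty$ and $P_e \to 0$.

³Here, a matrix is considered as a point in the probability space and the metric is induced by a matrix norm.

Proof: According to the definition of $\hat{\mathbf{R}}$, and with $\delta\mathbf{R} = \mathbf{R} - \hat{\mathbf{R}}$, we have

$$\hat{\mathbf{R}}^{-1} = \mathbf{R}^{-1} + \mathbf{R}^{-1}\mathbf{A},$$

where $\mathbf{A} = \left(\mathbf{I} - \delta\mathbf{R}\,\mathbf{R}^{-1}\right)^{-1} - \mathbf{I}$. According to the error analysis of matrix inversion in [?], we have⁴

$$E\{\|\mathbf{A}\|^2\} \le E\left\{\frac{\|\delta\mathbf{R}\,\mathbf{R}^{-1}\|^2}{\left(1 - \|\delta\mathbf{R}\,\mathbf{R}^{-1}\|\right)^2}\right\} = O(P_e),$$

which tends to 0 as $P_e \to 0$. Thus, we have

$$E\left\{\|\hat{\mathbf{R}}^{-1} - \mathbf{R}^{-1}\|^2\right\} \le \|\mathbf{R}^{-1}\|^2 E\{\|\mathbf{A}\|^2\} \to 0,$$

as $P_e \to 0$. Therefore, $\hat{\mathbf{R}}^{-1}$ converges to $\mathbf{R}^{-1}$ almost surely as $P_e \to 0$.

Applying the strong law of large numbers and the fact that the diagonal elements in

$$\mathbf{R} = \sum_{m=1}^{M}(\mathbf{S}(m)\mathbf{B}(m))^H\mathbf{S}(m)\mathbf{B}(m)$$

are $M$ and the off-diagonal elements in $(\mathbf{S}(m)\mathbf{B}(m))^H\mathbf{S}(m)\mathbf{B}(m)$ are independent for different values of $m$ and have zero mean, we obtain that, while keeping $K$ and $N$ fixed, $\frac{\mathbf{R}}{M} \to \mathbf{I}_{KL \times KL}$ almost surely, as $M \to \infty$. Since the elements of $\mathbf{R}^{-1}$ are continuous functions of those in $\mathbf{R}$ in a neighborhood of $\mathbf{R} = M\mathbf{I}_{KL \times KL}$, we also have $M\mathbf{R}^{-1} \to \mathbf{I}_{KL \times KL}$ as $M \to \infty$. This completes the proof. ■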
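Therefore, we can further approximate $\hat{\mathbf{R}}^{-1}$ by $\frac{1}{M}\mathbf{I}_{KL \times KL}$ for large $M$ and small $P_e$. For simplicity, our further discussion of $\delta\mathbf{a}_f$ will be based on this approximation, which will be validated by numerical results. Consequently, in the following discussions, we use the approximations

$$\delta\mathbf{a}_f = -\frac{1}{M}\hat{\mathbf{S}}^H\delta\mathbf{S}\,\mathbf{a},$$

and

$$\delta\mathbf{a}_n = -\frac{1}{M}\hat{\mathbf{S}}^H\mathbf{n}.$$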

2) Covariance matrix of channel estimation error: We denote the covariance matrices of $\delta\mathbf{a}$, $\delta\mathbf{a}_f$ and $\delta\mathbf{a}_n$ by $\boldsymbol{\Sigma}_a$, $\boldsymbol{\Sigma}_f$ and $\boldsymbol{\Sigma}_n$, respectively, which satisfy $\boldsymbol{\Sigma}_a = \boldsymbol{\Sigma}_f + \boldsymbol{\Sigma}_n$. We first consider the channel estimation error incurred by decision feedback errors. The following lemma shows that the channel estimation error $\delta\mathbf{a}_f$ is asymptotically biased. The proof is given in Appendix II.

Lemma III.2: When keeping $K$ and $N$ fixed, we have

$$E\{\delta\mathbf{a}_f\} \to 2P_e\mathbf{a}, \tag{6}$$

almost surely, as $M \to \infty$.

⁴$x = O(P_e)$ means $\frac{x}{P_e} < \infty$ as $P_e \to 0$.

It should be noted that this bias cannot be removed a priori in the estimator since it is dependent on the channel gain, $\mathbf{a}$. However, this bias vanishes as $P_e \to 0$.

An asymptotic expression for the elements in $\boldsymbol{\Sigma}_f$ is given in the following proposition, whose proof is given in Appendix III, where we also explain that the conclusion also applies to the asynchronous case when $P_e$ is sufficiently small.
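The bias toward $2P_e\mathbf{a}$ can be seen in the simplest possible setting. The sketch below assumes a single user with a single path and no thermal noise, in which case the spreading codes drop out (because $\mathbf{s}^H\mathbf{s} = 1$) and the decision-feedback estimate reduces to $\hat{a} = \frac{a}{M}\sum_m \hat{b}(m)b(m)$. This reduced setting and the function name are assumptions made for illustration only; they are not the general multiuser case treated in Appendix II.

```python
import random


def feedback_estimate_single_path(a, M, pe, rng):
    """Noise-free decision-feedback estimate for one user and one path.

    Each received symbol contributes b_hat(m) * b(m) * a, because the
    unit-norm spreading code cancels in the matched filter.
    """
    acc = 0
    for _ in range(M):
        b = rng.choice((-1, 1))                 # transmitted symbol
        b_hat = -b if rng.random() < pe else b  # fed back hard decision
        acc += b_hat * b
    return a * acc / M


if __name__ == "__main__":
    a = 0.6 - 0.8j  # illustrative channel gain
    pe = 0.05
    a_hat = feedback_estimate_single_path(a, 100000, pe, random.Random(1))
    print("estimation error a - a_hat:", a - a_hat)
    print("predicted bias 2*Pe*a     :", 2 * pe * a)
```

The estimate shrinks toward $(1 - 2P_e)a$, so increasing $M$ reduces the fluctuation around the bias but not the bias itself.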

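Proposition III.3: For all $i$ and $j$, when fixing $K$ and $N$, we have that (recall that $a_{kl}$ is the channel gain of user $k$ and path $l$)

$$M(\boldsymbol{\Sigma}_f)_{ij} \to \begin{cases} 4P_e\left(\left|a_{\lceil i/L \rceil,\,\mathrm{mod}(i,L)}\right|^2 + \frac{1}{N}\sum_{k=1,\,k \neq i}^{KL}\left|a_{\lceil k/L \rceil,\,\mathrm{mod}(k,L)}\right|^2\right), & \text{if } i = j,\\[4pt] 4P_e\left(1 + \frac{1}{N}\right)a_{\lceil i/L \rceil,\,\mathrm{mod}(i,L)}\,a^*_{\lceil j/L \rceil,\,\mathrm{mod}(j,L)}, & \text{if } i \neq j \text{ and } \lceil i/L \rceil = \lceil j/L \rceil, \end{cases} \tag{7}$$

almost surely, as $M \to \infty$.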

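For $\delta\mathbf{a}_n$, which is caused by thermal noise, the corresponding analysis is identical to that of training symbol based estimation. Then, we have

$$M\boldsymbol{\Sigma}_n = M\,\mathrm{cov}\left(\hat{\mathbf{R}}^{-1}\hat{\mathbf{S}}^H\mathbf{n}\right) \to \sigma_n^2\mathbf{I}_{KL \times KL}, \tag{8}$$

almost surely, as $M \to \infty$. Then the covariance matrix of channel estimation error $\boldsymbol{\Sigma}_a = E\{\delta\mathbf{a}\,\delta\mathbf{a}^H\} = \boldsymbol{\Sigma}_f + \boldsymbol{\Sigma}_n$ can be obtained from (7) and (8).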

3) Variance of channel estimation error: The variance of channel estimation error can be obtained as a corollary of the previous subsection.

Corollary III.4: On defining $\Delta_a = \frac{1}{KL}\mathrm{trace}\{\boldsymbol{\Sigma}_a\}$, we have

$$M\Delta_a \to \frac{4P_e(1 + \beta L)}{L} + \sigma_n^2, \tag{9}$$

almost surely, as $K, N, M \to \infty$.

Thus, when $K$, $N$, $M$ are sufficiently large, we have the following approximation

$$\Delta_a \approx \frac{4P_e(1 + \beta L)}{LM} + \frac{\sigma_n^2}{M}. \tag{10}$$

It should be noted that the channel estimation error cannot be removed by increasing $M$ although the variance vanishes as $M \to \infty$, since the estimate is biased and the bias cannot be removed a priori.
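As a numerical illustration, the sketch below evaluates the decision-feedback error variance, assuming the approximation $\Delta_a \approx \frac{4P_e(1+\beta L)}{LM} + \frac{\sigma_n^2}{M}$, together with the mean scaling $(1 - 2P_e)$ of the estimate implied by the bias in Lemma III.2. Function names and parameter values are illustrative.

```python
def feedback_error_variance(pe, sigma2, M, L, beta):
    """Approximate per-path error variance with decision feedback:
    4*Pe*(1 + beta*L)/(L*M) + sigma_n^2/M."""
    return 4.0 * pe * (1.0 + beta * L) / (L * M) + sigma2 / M


def feedback_mean_scaling(pe):
    """Asymptotic scaling of the estimate mean, E{a_hat} -> (1 - 2*Pe) * a."""
    return 1.0 - 2.0 * pe


if __name__ == "__main__":
    sigma2, L, beta = 0.1, 4, 0.5  # illustrative values
    for M in (50, 500, 5000):
        for pe in (0.0, 0.01, 0.05):
            print(M, pe,
                  feedback_error_variance(pe, sigma2, M, L, beta),
                  feedback_mean_scaling(pe))
```

The variance column shrinks as $M$ grows, while the mean scaling depends only on $P_e$, which is the point made in the closing remark above.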
C. Estimation with Both Training Symbols and Decision Feedback

We denote the number of training symbols by $M_t$ and the corresponding percentage by $\alpha = \frac{M_t}{M}$. When the training symbols and decision feedback are combined for channel estimation, the performance is determined by (10), with $P_e$ replaced by $(1 - \alpha)P_e$. Decision feedback should only be used along with the training symbols if the resulting variance is smaller than that obtained when only the training symbols are used. Then it is easy to check that, when $M$ and $M_t$ are sufficiently large, $P_{e,\max}$, the maximum $P_e$ assuring performance improvement when decision feedback is used, is determined by

$$\frac{4(1 - \alpha)P_e(1 + \beta L)}{LM} + \frac{\sigma_n^2}{M} = \frac{\sigma_n^2}{M_t},$$

which results in

$$P_{e,\max} = \frac{\sigma_n^2 L}{4\alpha(1 + \beta L)}, \tag{11}$$

from which we observe that $P_{e,\max}$ decreases with $\alpha$ and $\beta$ while increasing with $\sigma_n^2$ and $L$.
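The threshold can be checked by equating the two variances directly. The sketch below assumes the combined variance $\frac{4(1-\alpha)P_e(1+\beta L)}{LM} + \frac{\sigma_n^2}{M}$ and the training-only variance $\frac{\sigma_n^2}{M_t}$, and verifies that they coincide at $P_{e,\max} = \frac{\sigma_n^2 L}{4\alpha(1+\beta L)}$. Function names and numerical values are illustrative.

```python
def combined_variance(pe, sigma2, M, Mt, L, beta):
    """Error variance when Mt training symbols and M - Mt fed back
    decisions are used together (Pe scaled by 1 - alpha)."""
    alpha = Mt / M
    return 4.0 * (1.0 - alpha) * pe * (1.0 + beta * L) / (L * M) + sigma2 / M


def training_only_variance(sigma2, Mt):
    """Large-Mt error variance when only the training symbols are used."""
    return sigma2 / Mt


def pe_max(sigma2, M, Mt, L, beta):
    """Largest feedback error rate for which decision feedback still helps."""
    alpha = Mt / M
    return sigma2 * L / (4.0 * alpha * (1.0 + beta * L))


if __name__ == "__main__":
    sigma2, M, Mt, L, beta = 0.1, 100, 20, 4, 0.5  # illustrative values
    threshold = pe_max(sigma2, M, Mt, L, beta)
    print("Pe_max:", threshold)
    print("combined variance at Pe_max:",
          combined_variance(threshold, sigma2, M, Mt, L, beta))
    print("training-only variance     :", training_only_variance(sigma2, Mt))
```

A computed threshold above 0.5 simply means that, under these approximations, decision feedback helps for any feedback error rate a decoder would produce.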

IV. PIC and Channel Decoder

A. Performance Analysis of PIC 

For convenience of analysis, the performance of PIC is analyzed based on matched filter (MF) outputs. We drop the index of the symbol period for notational simplicity throughout this section. For a given symbol period, the MF outputs, which form sufficient statistics for multiuser detection, are given by

$$\mathbf{y} = \mathbf{S}^H\mathbf{r}.$$

In PIC based multiuser detection, the MAI reconstructed from the channel estimates and the decoder output is subtracted directly from the MF output of the desired user. Without loss of generality, we take the $l$-th path of user 1 as an example; then the MF output after PIC, which is contaminated by residual MAI and thermal noise $n_{1l} = \mathbf{s}_{1l}^H\mathbf{n}$, is given by

$$y_{1l} = a_{1l}b_1 + \sum_{m \neq l} a_{1m}\rho_{1l1m}b_1 + I_{1l}, \tag{12}$$

where

$$I_{1l} = \sum_{k=2}^{K}\sum_{m=1}^{L}\rho_{1lkm}\left(a_{km}b_k - \hat{a}_{km}\hat{b}_k\right) + n_{1l},$$

which is the sum of the residual interference and the thermal noise. It is obvious that $E\{I_{1l}\} = 0$, and the corresponding variance is given by

$$\begin{aligned}
\sigma_I^2 = E\{|I_{1l}|^2\} &= \frac{1}{N}\sum_{k=2}^{K}\sum_{m=1}^{L} E\left\{\left|\delta a_{km}b_k + \delta b_k a_{km} - \delta a_{km}\delta b_k\right|^2\right\} + E\{|n_{1l}|^2\}\\
&= \frac{1}{N}\sum_{k=2}^{K}\sum_{m=1}^{L}\Big(E\{|\delta a_{km}|^2\} + 4P_e|a_{km}|^2 + 4P_eE\{|\delta a_{km}|^2\} + 2E\{\delta a_{km}a_{km}^*\}E\{b_k\delta b_k\}\\
&\qquad\qquad - 2E\{|\delta a_{km}|^2\}E\{b_k\delta b_k\} - 8P_eE\{a_{km}\delta a_{km}^*\}\Big) + \sigma_n^2\\
&\to \beta L\Delta_a + 4P_e(1 - P_e)\beta + \sigma_n^2,
\end{aligned} \tag{13}$$

as $K, N \to \infty$, where we have applied the facts that $E\{|\delta a_{km}|^2\} = \Delta_a + 4P_e^2|a_{km}|^2$, $E\{a_{km}\delta a_{km}^*\} = 2P_e|a_{km}|^2$ and $E\{b_k\delta b_k\} = 2P_e$. It is easy to check that $\sigma_I^2$ is identical for asynchronous systems since different time offsets do not affect the interference power.
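The limiting value of the residual interference-plus-noise variance is simple to evaluate. The sketch below assumes the large-system expression $\sigma_I^2 = \beta L\Delta_a + 4P_e(1-P_e)\beta + \sigma_n^2$; the function name and the example numbers are illustrative.

```python
def residual_interference_variance(pe, delta_a, sigma2, L, beta):
    """Large-system variance of residual MAI plus noise after PIC:
    beta*L*delta_a + 4*Pe*(1 - Pe)*beta + sigma_n^2."""
    return beta * L * delta_a + 4.0 * pe * (1.0 - pe) * beta + sigma2


if __name__ == "__main__":
    sigma2, L, beta = 0.1, 4, 0.5  # illustrative values
    delta_a = 0.0026               # illustrative channel estimation error variance
    for pe in (0.0, 0.01, 0.05):
        print(pe, residual_interference_variance(pe, delta_a, sigma2, L, beta))
```

With perfect decisions and perfect channel knowledge ($P_e = 0$, $\Delta_a = 0$) the expression reduces to the thermal noise variance, i.e. the interference is cancelled completely.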

It is difficult to apply the central limit theorem to show the asymptotic normality of the PIC output since the variables $\{\delta a_{km}\}$ are mutually correlated across different users and paths. However, numerical results in Section VI will show that the output distribution of PIC can be well approximated by a Gaussian distribution. Thus, in the subsequent sections, we assume that the output of PIC is Gaussian distributed.

According to the properties of the crosscorrelation given in Section II.A, $\rho_{1l1m} \to 0$ almost surely for $m \neq l$, as $N \to \infty$. Thus, for large spreading gain, the interference across different paths of the same user can be ignored. With the normality assumption of the residual MAI, it is easy to show that the variables $\{I_{1l}\}_{l=1,\ldots,L}$ are mutually independent as $N \to \infty$, which means that channel coded symbol $b_1$ is transmitted through $L$ independent channels. This assumption simplifies the analysis although it does not hold exactly when $N$ is finite. Thus, we use MRC to collect these $L$ replicas, resulting in the output

$$z_1 = \sum_{l=1}^{L}\hat{a}_{1l}^* a_{1l}b_1 + \sum_{l=1}^{L}\hat{a}_{1l}^* I_{1l}. \tag{14}$$

Applying Lemma III.2, we obtain that, as $M, L \to \infty$,

$$\sum_{l=1}^{L}\hat{a}_{1l}^* a_{1l} = \sum_{l=1}^{L}\left(a_{1l} - \delta a_{1l}\right)^* a_{1l} \to 1 - \sum_{l=1}^{L}E\{\delta a_{1l}^*\}a_{1l} \to 1 - 2P_e\sum_{l=1}^{L}|a_{1l}|^2 \to 1 - 2P_e.$$

Moreover, we can obtain that, as $M, L \to \infty$,

$$\sum_{l=1}^{L}|\hat{a}_{1l}|^2 \to \sum_{l=1}^{L}\left(|a_{1l}|^2 - 2E\{\delta a_{1l}^* a_{1l}\} + E\{|\delta a_{1l}|^2\}\right) \to (1 - 2P_e)^2 + L\Delta_a.$$

Therefore, when $M$ and $L$ are sufficiently large, (14) can be approximated by

$$z_1 = (1 - 2P_e)b_1 + n_1, \tag{15}$$

where $n_1$ is a CSCG random variable with variance $\left((1 - 2P_e)^2 + L\Delta_a\right)\sigma_I^2$. An interesting observation is that the channel estimation error not only increases the interference but also decreases the valid received power of the desired user.
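Both effects can be read off the signal-to-interference-plus-noise ratio implied by the combiner output. The sketch below assumes a useful amplitude of $(1 - 2P_e)$ and a noise variance of $\left((1 - 2P_e)^2 + L\Delta_a\right)\sigma_I^2$, and forms their ratio; the function name `mrc_sinr` and the numbers are illustrative.

```python
def mrc_sinr(pe, delta_a, sigma_i2, L):
    """SINR at the MRC output: useful power (1 - 2*Pe)^2 divided by the
    variance ((1 - 2*Pe)^2 + L*delta_a) * sigma_I^2 of the noise term."""
    signal_power = (1.0 - 2.0 * pe) ** 2
    noise_power = (signal_power + L * delta_a) * sigma_i2
    return signal_power / noise_power


if __name__ == "__main__":
    sigma_i2, L = 0.1, 4  # illustrative values
    for pe in (0.0, 0.05):
        for delta_a in (0.0, 0.01):
            print(pe, delta_a, mrc_sinr(pe, delta_a, sigma_i2, L))
```

When $\Delta_a = 0$ the ratio equals $1/\sigma_I^2$ regardless of $P_e$; the loss of useful power only shows up in the SINR once the estimation error variance is non-zero.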

B. Performance of Channel Decoder 

At the channel decoder, $P_e$ is a function of the signal-to-interference-plus-noise ratio (SINR) at the input to the channel decoder, given by

$$P_e = g\left(\mathrm{SINR}^{-1}\right), \tag{16}$$

where the function $g$ can be estimated using Monte Carlo simulations. For most practical channel codes, the following assumption is reasonable:

Assumption IV.1: Within a closed interval $\Omega = [0, \sigma_{I,\max}^2]$, function $g$ satisfies

• $g(x)$ monotonically increases with $x$, and $g(0) = 0$;

• $g(x)$ is continuously differentiable and $g'(0) = 0$.

V. Analysis of System Performance

In this section, we analyze the overall iterative system shown in Figure 1. We consider only the case of small $P_e$, moderate $\sigma_n^2$ and moderate $M$ and note that the analytic results become more precise as $P_e$ and $\sigma_n^2$ decrease and $M$ increases. This configuration is reasonable for decision feedback based systems since if $M$ is large, training symbol based channel estimation can be adopted with marginal loss of spectral efficiency; if $M$ is small, it is difficult to carry out coherent detection; and if $P_e$ is large, the iteration diverges. Although the performance analysis of the channel estimation in Section III is based on large $M$, numerical results in Section VI indicate that expression (10) is still valid for moderate $M$. We adopt the expressions (10) and (13) in large system limits ($K, N \to \infty$).

A. Iterative Mapping 

In this section, we consider the $d$-th iteration and couple the results from Section III and Section IV to analyze the overall system performance. We can regard the decoding process as an iterative mapping $h: \mathbb{R} \to \mathbb{R}$ in terms of the error probability of the decoder output after the $d$-th iteration, $P_e^{(d)}$, which is given by (recall that $g$ is defined in (16) as the function characterizing the output error probability in terms of the input SINR)

$$P_e^{(d)} = h\left(P_e^{(d-1)}\right) = g\left(D_0 + D_1P_e^{(d-1)}\right), \tag{17}$$

where we ignore terms of a smaller order than $P_e$ and $\frac{1}{M}$ since we assume small $P_e$ and large (or moderate) $M$. Based on (10), (13) and (15), the coefficients $D_0$ and $D_1$ are given by

B. Condition for Convergence

A reasonably good initialization, which results in sufficiently small channel estimation error and MAI in the first iteration, is necessary to guarantee the convergence of the iterative mapping described in (17). In the initial stage, only training symbols are used for the channel estimation since no decision feedback is available then. Any non-iterative multiuser detection technique can be applied in the initializing stage. For practical applications, we can use the LMMSE detector, whose performance using imperfect channel estimation can be obtained using the replica method [15].

For convergence, the variance of input interference and noise of the initializing stage, denoted by $\sigma_I^2(0)$ and obtained from the SINR of the LMMSE detector, must satisfy the following conditions:

• $\sigma_I^2(0)$ is located within the interval $\Omega$ defined in Section IV.B, namely

$$\sigma_I^2(0) < \sigma_{I,\max}^2. \tag{18}$$

This condition assures a reasonably good initial performance of the iterations.

• The variance of interference and noise decreases with iteration time, namely

$$\sigma_I^2(1) < \sigma_I^2(0). \tag{19}$$

This condition assures that the iterations do not diverge.

C. Condition Assuring the Uniqueness of the Fixed Point 

If there exists more than one fixed point, the iteration may become stuck at a suboptimal fixed point and 
not converge to the optimal one. The following proposition provides a sufficient condition for the uniqueness 
of the fixed point and the corresponding convergence rate. 

Proposition V.l: (1) If there exists a 7 < 1, such that 

D, < ^— (20) 

max^.gn {g'{x)) 

then there exists only one fixed point Xf for the iterative mapping Xk+i = h (x^), and for every initial point 

k 

Xq E the mapping converges to xj with an exponential rate, namely — < ||xo — 

(2) If there exists an xi G f2 such that -77^ < Di < then there exists a Dq such that there is more 
than one fixed point for h. 

Proof: (1) The condition $D_1 < \gamma / \max_{x \in \Omega} g'(x)$ implies that $h'(x) = D_1 g'(D_0 + D_1 x) < \gamma < 1$. Then $h(\cdot)$
is a contraction mapping, and the conclusions follow from Banach's fixed point theorem [14].

(2) Letting $x_f = g(x_1)$ and setting $D_0 = x_1 - D_1 x_f$, we can show that $D_0 > 0$ due to the assumption that
$D_1 < x_1 / g(x_1) = x_1 / x_f$. It is easy to check that $x_f$ is a fixed point and $h'(x_f) = D_1 g'(D_0 + D_1 x_f) = D_1 g'(x_1) > 1$. Hence,
there exists an $\epsilon > 0$ such that for all $x \in (x_f, x_f + \epsilon)$, $g(D_0 + D_1 x) > x$. However, $g(D_0 + D_1 x_2) < x_2$
for $x_2 = g(\sigma_I^2(0))$ due to condition (19). If $x_2 < x_f$, there exists at least one fixed point within $(0, x_2)$ since
$g(D_0) > 0$; if $x_2 > x_f$, there exists at least one fixed point different from $x_f$ within $(x_f, x_2)$. ■
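
Part (1) of the proposition can be illustrated numerically. The following is a minimal sketch, not the decoder characteristic used in this paper: the curve `g` (an uncoded BPSK error rate for a Gaussian disturbance of variance x), the values `D0 = 0.2` and `D1 = 0.5`, and the assumption that error rates lie in [0, 0.5] are all chosen purely for illustration. It bounds the contraction factor by $D_1 \max g'$ on a grid and checks that two different starting points reach the same fixed point.

```python
import math

def g(x):
    """Hypothetical error-rate curve: Q(1/sqrt(x)) for disturbance variance x > 0."""
    return 0.5 * math.erfc(1.0 / math.sqrt(2.0 * x))

def g_prime(x, eps=1e-6):
    """Central-difference estimate of g'(x)."""
    return (g(x + eps) - g(x - eps)) / (2.0 * eps)

def iterate(d0, d1, x0, n_iter=50):
    """Run x_{k+1} = h(x_k) = g(d0 + d1 * x_k) and return the whole trajectory."""
    xs = [x0]
    for _ in range(n_iter):
        xs.append(g(d0 + d1 * xs[-1]))
    return xs

# Assumed illustrative parameters (not taken from the paper).
D0, D1 = 0.2, 0.5

# Error rates x lie in [0, 0.5], so the argument of g lies in [D0, D0 + 0.5 * D1].
grid = [D0 + 0.5 * D1 * i / 1000 for i in range(1001)]
gamma = D1 * max(g_prime(v) for v in grid)  # contraction factor; must be below 1

xs_a = iterate(D0, D1, x0=0.5)  # pessimistic start
xs_b = iterate(D0, D1, x0=0.0)  # optimistic start

print("gamma =", gamma)
print("limit from x0 = 0.5:", xs_a[-1])
print("limit from x0 = 0.0:", xs_b[-1])
```

Because `gamma` is well below one for these assumed values, both trajectories settle on the same point after a handful of iterations, which is the behaviour the contraction argument predicts.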
It should be noted that condition (20) is sufficient but not necessary for the uniqueness of the fixed point.
This condition is more stringent than the condition of convergence in (19) since it assures both the uniqueness
of the fixed point and the exponential convergence rate. The second part shows that a moderate $D_1$ may cause
multiple fixed points. A useful conclusion drawn from (20) is that this iterative procedure does not work
well for channel codes, such as powerful turbo codes or LDPC codes, that have a steep performance
curve (bit error rate versus SINR), which implies a large value of $\max_{x \in \Omega} g'(x)$. This will be demonstrated
in numerical simulations in Section VI.
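
The effect of a steep performance curve can be sketched with a toy model. In the example below, the logistic curve `g`, its `steepness` and `center` parameters, and the values `d0 = 0.1` and `d1 = 1.0` are hypothetical stand-ins for a real decoder characteristic; they are not taken from this paper. The code counts the fixed points of $h(x) = g(D_0 + D_1 x)$ on [0, 0.5] by counting sign changes of $h(x) - x$ on a fine grid.

```python
import math

def g(v, steepness, center=0.3):
    """Hypothetical error-rate curve: a logistic rising from 0 to 0.5 around `center`."""
    return 0.5 / (1.0 + math.exp(-steepness * (v - center)))

def count_fixed_points(steepness, d0=0.1, d1=1.0, n_grid=10001):
    """Count sign changes of h(x) - x on [0, 0.5], where h(x) = g(d0 + d1 * x)."""
    xs = [0.5 * i / (n_grid - 1) for i in range(n_grid)]
    positive = [g(d0 + d1 * x, steepness) - x > 0.0 for x in xs]
    return sum(1 for a, b in zip(positive, positive[1:]) if a != b)

for s in (4.0, 40.0):
    # For this logistic, the largest slope of g is 0.5 * s / 4, attained at v = center.
    max_slope = 0.5 * s / 4.0
    print(f"steepness = {s}: D1 * max g' = {1.0 * max_slope:.2f}, "
          f"fixed points found = {count_fixed_points(s)}")
```

With the shallow curve, $D_1 \max g'$ is below one and a single crossing is found. With the steep curve the product exceeds one and additional crossings appear, including one at a high error rate where the iteration could become stuck.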

D. Asymptotic Multiuser Efficiency

The asymptotic multiuser efficiency (AME) measures the slope at which the bit error rate goes to zero
on a logarithmic scale, giving intuition into the performance loss from multiuser interference.

Suppose that there is only one fixed point for the iterative mapping $h$, and let $P_e(\sigma_n^2)$ be this fixed point
when the noise power is $\sigma_n^2$. Similarly, let $D_0(\sigma_n^2)$ and $D_1(\sigma_n^2)$ be the corresponding values of $D_0$ and $D_1$ in
(17). It is obvious that $P_e(0) = 0$ and $D_0(0) = 0$.

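The asymptotic multiuser efficiency is given by

$$\mathrm{AME} = \lim_{\sigma_n^2 \to 0} \frac{\sigma_n^2}{D_0(\sigma_n^2) + D_1(\sigma_n^2) P_e(\sigma_n^2)} = \lim_{\sigma_n^2 \to 0} \frac{1}{\dfrac{dD_0(\sigma_n^2)}{d\sigma_n^2} + \dfrac{d\left(D_1(\sigma_n^2) P_e(\sigma_n^2)\right)}{d\sigma_n^2}}.$$
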
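If $H(P_e, \sigma_n^2) = g\left(D_0(\sigma_n^2) + D_1(\sigma_n^2) P_e\right) - P_e$, then $P_e(\sigma_n^2)$ is the unique solution of $H(P_e, \sigma_n^2) = 0$. Applying
the assumptions that $g'(0) = 0$ and $P_e(0) = 0$, we have

$$\left.\frac{dP_e}{d\sigma_n^2}\right|_{\sigma_n^2 = 0} = -\left.\frac{\partial H / \partial \sigma_n^2}{\partial H / \partial P_e}\right|_{\sigma_n^2 = 0} = -\,g'(0)\, \frac{\left.\dfrac{\partial \left(D_0(\sigma_n^2) + D_1(\sigma_n^2) P_e\right)}{\partial \sigma_n^2}\right|_{\sigma_n^2 = 0}}{D_1(0)\, g'(0) - 1} = 0.$$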



Thus 



AME 



L(3 ■ 



1 + ^ 



(21) 



From (21) we can see that the loss of AME is due to the channel estimation error incurred by the thermal
noise. The impact of the decision feedback error vanishes as $\sigma_n^2 \to 0$, while that of the channel estimation
error remains.



E. Computational Aspect 

The main computational cost of the iterative channel estimation and multiuser detection includes: 
• Solving the linear equation Ra = y for ML channel estimation.

• Reconstructing the channel symbols and cancelling the interference. 

• Channel decoding. 

Since the channel symbol reconstruction is similar to the encoding procedure and the interference
cancellation requires only subtractions, this step is not a bottleneck of the whole procedure, and the corresponding
computational cost is of complexity $O(K)$. Real-time channel decoding can also be accomplished in a way
similar to turbo codes. Therefore, the main bottleneck is solving the linear equation for channel estimation.

Direct Gaussian elimination, which is of complexity $O(K^3)$, can be applied to solve the equation Ra =
y when K is small. When K is large, iterative techniques for solving linear equations, such as the Jacobi
method and the Gauss-Seidel method, can be applied. To assure convergence, we cite the following
lemma from [10]:

Lemma V.2: The sufficient and necessary condition for the convergence of iterations in solving the linear 
equation Ax = y is that 

• A and 2 diag(A) — A are both positive definite in the Jacobi method^; 

• A is positive definite in the Gauss-Seidel method. 

The Gauss-Seidel method always converges when $\beta < 1$ since R is positive definite when $K < N$. For
the Jacobi method, it is easy to check that $\mathrm{diag}(R) = I_{K \times K}$. Since the largest eigenvalue of R converges to
$(1 + \sqrt{\beta})^2$ almost surely as $K, N \to \infty$, the eigenvalues of $2\,\mathrm{diag}(R) - R$ are no less than $2 - (1 + \sqrt{\beta})^2$
almost surely in the large system limit. Therefore, $\beta + 2\sqrt{\beta} < 1$ is a sufficient condition for the almost sure
convergence of the Jacobi iteration in the large system limit. Then, when K and N are sufficiently large and
$K < N$, we can use either Gauss-Seidel or Jacobi iterations to estimate the channel coefficients efficiently.
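
The two iterative solvers and the conditions of Lemma V.2 can be sketched as follows. This is a simplified illustration under stated assumptions: it uses real-valued random binary spreading codes with a single path, and the sizes (256 chips, 8 users), the helper names and the seed are all hypothetical choices rather than the configuration of this paper. Positive definiteness is tested by attempting a Cholesky factorization.

```python
import math
import random

def random_correlation_matrix(n_chips, n_users, rng):
    """R = S^T S for random binary spreading codes with entries +-1/sqrt(n_chips)."""
    s = [[rng.choice((-1.0, 1.0)) / math.sqrt(n_chips) for _ in range(n_users)]
         for _ in range(n_chips)]
    return [[sum(s[c][i] * s[c][j] for c in range(n_chips)) for j in range(n_users)]
            for i in range(n_users)]

def is_positive_definite(a):
    """Try a Cholesky factorization; it fails when a pivot is not positive."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:
                    return False
                low[i][i] = math.sqrt(acc)
            else:
                low[i][j] = acc / low[j][j]
    return True

def jacobi(a, y, n_iter=200):
    """Jacobi iteration: every component is updated from the previous iterate."""
    n = len(a)
    x = [0.0] * n
    for _ in range(n_iter):
        x = [(y[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
             for i in range(n)]
    return x

def gauss_seidel(a, y, n_iter=200):
    """Gauss-Seidel iteration: components are updated in place, one after another."""
    n = len(a)
    x = [0.0] * n
    for _ in range(n_iter):
        for i in range(n):
            x[i] = (y[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
    return x

rng = random.Random(0)
N_CHIPS, N_USERS = 256, 8  # assumed sizes, i.e. a light load of 1/32
R = random_correlation_matrix(N_CHIPS, N_USERS, rng)
a_true = [rng.gauss(0.0, 1.0) for _ in range(N_USERS)]
y = [sum(R[i][j] * a_true[j] for j in range(N_USERS)) for i in range(N_USERS)]

# 2 diag(R) - R keeps the diagonal of R and negates the off-diagonal entries.
two_d_minus_r = [[R[i][j] if i == j else -R[i][j] for j in range(N_USERS)]
                 for i in range(N_USERS)]
jacobi_ok = is_positive_definite(R) and is_positive_definite(two_d_minus_r)
gs_ok = is_positive_definite(R)

err_j = max(abs(u - v) for u, v in zip(jacobi(R, y), a_true))
err_gs = max(abs(u - v) for u, v in zip(gauss_seidel(R, y), a_true))
print("Jacobi condition holds:", jacobi_ok, "max error:", err_j)
print("Gauss-Seidel condition holds:", gs_ok, "max error:", err_gs)
```

A light load is used here so that the Jacobi condition is comfortably met; at heavier loads only the Gauss-Seidel check would be expected to pass.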

VI. Numerical Results 

A. Channel Estimation 

Figure 2 shows the average variance of the channel estimates versus the coherence time M with the
configuration of $\beta = 0.2$, $L = 5$, $M_t = 0$, $P_e = 0.1$ and the signal-to-noise ratio (SNR) = 5 dB^. The
asymptotic results obtained from (10) and the simulation results for finite systems ($N = 100$) with spreading
codes for the shifted model are represented by solid and dotted curves, respectively. In this figure, the
estimation error variances caused by decision feedback and noise are denoted by $\Delta_f$ and $\Delta_n$, respectively.
The corresponding asymptotic results are obtained from the first and the second terms in (10), respectively.
We can observe that the asymptotic results match the simulation results well even when M is small. This
figure also demonstrates the validity of results based on the independence assumption of the spreading codes
given in Section II.A.

^diag(X) denotes a diagonal matrix constituted by the diagonal elements in matrix X.

^Note that $P_e$ and SNR are not mutually independent; however, we set these two parameters arbitrarily to test the validity of the asymptotic results.

B. Normality of PIC Output 

Figure 3 shows the channel symbol error rate^ with the configuration of SNR = 10 dB, $K = N = 30$
and $P_e = 0.1$ and 0.05. The solid curves represent the results obtained from numerical simulations and the
dashed curves represent the results with the assumption that the output of PIC is CSCG distributed. The gap
between the numerical results and the CSCG-based prediction is small, thus justifying the normality assumption
of the PIC output.

C. User Capacity 

We define the user capacity to be the maximum system load $\beta_{\max}$ with which the system can achieve
the information bit error rate of 10^'^. Two types of channel codes, the convolutional code (35, 23)8 and a 
turbo code (with two constituent codes (37, 21)8), with bit rate R = ^ and codeword length 1024 are used 
in this paper and their error rates for both information bits and extrinsic information based channel symbols 
are shown in Figure 4. The corresponding values of $\beta_{\max}$ for various values of the coherence time M, denoted by
'iterative', are given in Figure 5 and Figure 6 for convolutional codes and turbo codes, respectively, with the
configuration a = 0.2, SNR = 5 dB and L = 5. The values of $\beta_{\max}$ for the non-iterative LMMSE detector, denoted
by 'LMMSE', are given for comparison. We can see that the iterative system achieves substantially higher
user capacity than the non-iterative one. The performance of systems with ideal initialization, where actual
channel parameters are provided by a genie in the initialization stage, denoted by 'Perfect initialization',
implies that a good initialization can improve the performance considerably. Thus, blind or semi-blind
non-iterative techniques, which make use of information symbols, can be applied to obtain a better initialization.
For comparison, the user capacities of both iterative and non-iterative systems with perfect channel state
information are also given in both figures. An interesting observation is that the relative performance gain
of iterative systems over the non-iterative ones is smaller for turbo codes than for convolutional codes. This
is due to the steeper waterfall region in turbo codes.

^This channel symbol error rate is equivalent to the bit error rate when the output of PIC is used directly for the detection (without channel
decoding).

VII. Conclusions 

In this paper, we have analyzed the performance of decision feedback based iterative channel estimation 
and multiuser detection in multipath DS-CDMA channels. The decoding process has been described as an 
iterative mapping in terms of the variance of the channel decoder output, and conditions assuring the convergence
and uniqueness of a fixed point have been proposed. Numerical results show that the initialization is
important to the iterations, thus necessitating the use of non-iterative blind or semi-blind channel estimation 
algorithms for initialization purposes. Another observation of interest is that the gain of the iterative process 
over a non-iterative one is small when a near-optimal channel coding scheme is used. 

Appendix I 

Validity of Independence Model for Spreading Codes 

In (1), for different values of $l$ and $m$, $s_{kl}$ and $s_{km}$ are generated by the same binary sequence with different
offsets. Our purpose is to show that if K and N are large enough, we can regard the shifted spreading codes
of different paths of a given user as independent sequences. The properties based on this assumption, which
are used for the system performance analysis in this paper, include:

• The properties of the crosscorrelation $\rho_{klmn}$ in Section II.A.

• The distribution of the eigenvalues of the matrix $SS^H$, used when developing the expression of $\Delta_n$ for finite
M and large K in Section III.C. Our assumption means that the corresponding distribution of the shifted
model is asymptotically identical to that of the independent model.

It is easy to check the first item using the symmetry of the binary distribution. However, the validity of 
the second one is non-trivial and is of considerable importance when applying the theory of large random 
matrices to multipath fading channels. We can tackle this problem by showing that the moments of the 
eigenvalues in both models are the same via the following lemma. 

Lemma I.1: Denote a generic eigenvalue of $SS^H$ by $\lambda$. Then the m-th moment of $\lambda$ in the shifted model
is given by

$$E\{\lambda^m\} \to \sum_{k=1}^{m} (\beta')^k \sum_{m_1 + \cdots + m_k = m} c(m_1, \ldots, m_k), \quad \text{as } K \to \infty,$$

which is the same expression as that of the independent model, and where the definition of $c(m_1, \ldots, m_k)$ is
given in ifT^ and 0' = 

Proof: Using similar arguments to those in [19], we have

1 



E{trace{(SS^)™}} 

^ K N 

il,...,im = l jl,...,jm = l 

where Vij = ^/NSij. 

For any $i_r \ne i_s$, $v_{i_r j_p} = v_{i_s j_q}$ when $\lceil i_r / L \rceil = \lceil i_s / L \rceil$ and $j_p - j_q$ equals the offset difference between these
two shifted sequences. However, the probability of such events vanishes as $K \to \infty$ since

(m \ 2L + 1 
l-^^O, asK^oo. 

Thus, as $K \to \infty$, the term involving the $v_{ij}$'s of different users, which are mutually independent, dominates
the summation in (22). The remaining part of the proof is the same as in [19]. ■
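
The claim of Lemma I.1 can be probed numerically for small sizes. The sketch below compares the first three empirical eigenvalue moments of $SS^T$ under an independent model and a shifted model. Several things here are assumptions made only for illustration: real-valued binary chips, the sizes `N`, `K`, `L`, the construction of the shifted columns as delayed windows of one long sequence per user, and the ratio `beta_p = K * L / N`. The reference values are the standard large-system moments for a matrix with independent entries.

```python
import math
import random

def gram(columns):
    """Gram matrix G = S^T S for a matrix S given as a list of its columns."""
    n = len(columns)
    return [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two square matrices stored as lists of rows."""
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def eigen_moments(columns, n_chips, max_order=3):
    """Return (1/N) tr((S S^T)^m) for m = 1..max_order via tr((S^T S)^m)."""
    g = gram(columns)
    power = g
    moments = []
    for m in range(1, max_order + 1):
        if m > 1:
            power = matmul(power, g)
        moments.append(sum(power[i][i] for i in range(len(g))) / n_chips)
    return moments

rng = random.Random(1)
N, K, L = 128, 16, 4  # assumed sizes: chips per symbol, users, paths per user
scale = 1.0 / math.sqrt(N)

# Independent model: every (user, path) pair gets its own random column.
independent = [[rng.choice((-1.0, 1.0)) * scale for _ in range(N)]
               for _ in range(K * L)]

# Shifted model: the L paths of a user are delayed windows of one long sequence.
shifted = []
for _ in range(K):
    seq = [rng.choice((-1.0, 1.0)) * scale for _ in range(N + L - 1)]
    for path in range(L):
        shifted.append(seq[path:path + N])

beta_p = K * L / N
theory = [beta_p, beta_p + beta_p ** 2, beta_p + 3 * beta_p ** 2 + beta_p ** 3]
mom_ind = eigen_moments(independent, N)
mom_shift = eigen_moments(shifted, N)

for m, (t, a, b) in enumerate(zip(theory, mom_ind, mom_shift), start=1):
    print(f"m = {m}: large-system {t:.3f}, independent {a:.3f}, shifted {b:.3f}")
```

For these sizes the two models should produce moments close to each other and to the large-system values; the agreement is only approximate because K and N are finite.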

The following lemma (Theorem 30.1 in [5]) provides a sufficient condition for the equality of two
probability measures when their moments are identical.

Lemma I.2: Let $\mu$ be a probability measure on the real line having finite moments $\alpha_k = \int_{-\infty}^{\infty} x^k \mu(dx)$ of
all orders. If the power series $\sum_{k=1}^{\infty} \alpha_k r^k / k!$ has a positive radius of convergence, then $\mu$ is the only probability
measure with the moments $\{\alpha_m\}_{m = 1, 2, \ldots}$.

To apply Lemma I.2, we need the following lemma, which provides an upper bound for the moments
of the eigenvalues.

Lemma I.3: For any eigenvalue $\lambda$ of $SS^H$, there exists a constant $C > \max(1, \beta')$ such that for $m = 1, 2, \ldots$

$$E\{\lambda^m\} < C^m m^{m-1}. \tag{23}$$

Proof: The result follows by induction on m.
It is easy to verify that (23) holds when $m = 1, 2$. Suppose $E\{\lambda^n\} < C^n n^{n-1}$ for $n = 1, 2, \ldots, m$. We use
the following recursive formula [19] to evaluate $E\{\lambda^{m+1}\}$, which is given by

$$E\{\lambda^{m+1}\} = \sum_{k=1}^{m+1} \beta' \sum_{m_1 + \cdots + m_k = m+1} E\{\lambda^{m_1 - 1}\} \cdots E\{\lambda^{m_k - 1}\}.$$

Then we have

(m—l 
l + mE{X} + E{X"'} + Y, Yl E [X""'-^} ...E {X""'~^} 
k=2 mi + ...+mfc=m+l 

(m—l k 
l + m(3' + C"'m"'-^ + Yl n 

k=2 mi + ...+mfe=m+l i=l 

(m—l 
l + m(3' + C^rrT'^ + ^ ^ C""+^- 
fc=2 mi + ...+mfe=m+l 

m—l 



,_mi— 3 



m-2 



< C""+^ 1 + m™-^ + ^ 







\k-l) 




I m — l 






[ k 





fc=o y k 

= C"^+^(l + m)'""\ 

where the first inequality is based on the assumption for $n = 1, \ldots, m$ and the fact that $E\{\lambda\} = \beta'$; the third
inequality applies the condition that C > max(l, /?') and m"^"^ > m™~^ + m for m > 2. This concludes 
the proof. ■ 
Applying Stirling's formula and Lemmas I.1, I.2 and I.3, we can obtain the conclusion that the eigenvalue
distribution of $SS^H$ in the shifted model is identical to that of the independent model, thus assuring the assumption
that the columns of S can be regarded as independent in the large system limit.
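
The step that links the moment bound to the power-series condition can be spelled out as follows. This is a short worked sketch, writing $\alpha_m = E\{\lambda^m\}$ and using the elementary bound $m! \ge (m/e)^m$, which is a weaker form of Stirling's formula.

```latex
\[
\frac{\alpha_m r^m}{m!}
\;\le\; \frac{C^m m^{m-1} r^m}{(m/e)^m}
\;=\; \frac{(C e r)^m}{m},
\qquad \text{since } m! \ge \left(\frac{m}{e}\right)^m ,
\]
so that
\[
\sum_{m=1}^{\infty} \frac{\alpha_m r^m}{m!}
\;\le\; \sum_{m=1}^{\infty} \frac{(C e r)^m}{m} \;<\; \infty
\qquad \text{for } 0 \le r < \frac{1}{C e}.
\]
```

The power series therefore has a radius of convergence of at least $1/(Ce) > 0$, which is the hypothesis needed to conclude that the moments determine the eigenvalue distribution uniquely.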

Appendix II 
Proof of Lemma UTOl 

Proof: From the definition of $\delta a_f$, we have

E{5af} = {E{S^6Sa} - E{6S^6Sa}) . (24) 

We consider the term $E\{S^H \delta S\, a\}$ first. It is easy to check that (recall that $s_{kl}$ denotes the spreading code
of user k along path l)
1 ^ 

^El(^f^^^^) \ = —' 



1 1 ^ 

-e{{6S^6S)^^} = j^Y^l('^>rs{m)E{5b,SK} 



m=l 
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0, ifpT^r 

4Pe T 



M 



where $p = \lceil i/L \rceil$, $q = \mathrm{mod}(i, L)$, $r = \lceil j/L \rceil$, $s = \mathrm{mod}(j, L)$. It should be noted that we applied the fact that
$E\{\delta b_p \delta b_r\} = 4P_e$ in the second equality.

According to Assumption II.3, the spreading codes are mutually independent for different users or different
paths. Thus, by applying the strong law of large numbers, we have



1 ^ 



Therefore, we have 



Similarly, we can show that 



i-i?|(S^5S).. 



This completes the proof. 



0, if (p, g) ^ (r, s) 



0, ifz^j 
^ if i = ?■ 



0, ifz^i 



almost surely, as M ^ oo 



almost surely, as M — > cxd 



Appendix III 
Proof of Prop. III.3

Proof: The covariance matrix $\Sigma_f$ is given by



M2' 

- -^E {(5S^(5Saa^5S^S^} + {5S^5Saa^(5S^(5S} 

- E{5^f}E{5^f}^ . 



M2' 
M2- 



(25) 



The elements of $S^H \delta S\, a a^H \delta S^H S$ are given by

M M KL KL 



(S^^Saa^^S^S)^^. = ^ ^ sf (p)<5s,(p)sj(g)5s,(g)a,a;, 



p=l g=l k=l 1=1 
where $s_i(p) = b_{\lceil i/L \rceil}(p)\, s_{\lceil i/L \rceil, \mathrm{mod}(i,L)}(p)$, namely the spreading code (incorporating the channel symbol) of
the $\mathrm{mod}(i, L)$-th path of user $\lceil i/L \rceil$ at symbol period p, $\delta s_i(p) = \delta b_{\lceil i/L \rceil}(p)\, s_{\lceil i/L \rceil, \mathrm{mod}(i,L)}(p)$, and $a_k$ is the
k-th element of vector a and equals $a_{\lceil k/L \rceil, \mathrm{mod}(k,L)}$. To compute the corresponding expectation, we apply the
following properties, which are based on Assumption II.4:

• When $p = q$, if $\lceil k/L \rceil = \lceil l/L \rceil$, $P(\delta s_k(p) \ne 0, \delta s_l(q) \ne 0) = P_e$, since $\delta s_k(p)$ and $\delta s_l(p)$ are determined
by the same decision feedback;

• When $p = q$, if $\lceil k/L \rceil \ne \lceil l/L \rceil$, $P(\delta s_k(p) \ne 0, \delta s_l(q) \ne 0) = P_e^2$, since $\delta s_k(p)$ and $\delta s_l(p)$ are determined
by decision feedback from different users;

• When $p \ne q$, $P(\delta s_k(p) \ne 0, \delta s_l(q) \ne 0) = P_e^2$, since $\delta s_k(p)$ and $\delta s_l(q)$ are determined by decision
feedback from different symbol periods;

• When $\delta s_k(p) \ne 0$, $\delta s_k(p) = 2 s_k(p)$.

Thus the expectation of the (i, j)-th element of $S^H \delta S\, a a^H \delta S^H S$ is given by

£;|(S^(5Saa^5S^S),^ 

M KL 

= 5Z 'yP)^k{p)^^j {p)h{p)^k^*i 

p=i fc=i r|]=r|] 

M KL 

+ E §f(p)§.(p)§J(p)§Kp)a.ar 

M KL KL 

^ fc=i 1=1 

p,q = l 

= T. + T^ + Ts, 

where $T_1$, $T_2$ and $T_3$ represent the corresponding three summations, respectively.

Applying the strong law of large numbers and the assumption on the spreading codes that $\{s_i(p)\}$ are
independent for different values of i or p, we can obtain that, as $M \to \infty$, the following conclusions hold
almost surely:

M^i ^ ] 4P,(l + i)a,a*, if z ^ j and [f] = ] , 

0, if ril ^ Til 
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^ ' 1 0, if Til = \i] 

We can apply the same manipulation and obtain that E {S'^ 5Saa^ 5S^ 5S} = E {(5S^(5Saa^(5S^S} = 
^E {5S^5Saa^5S^5S} as M ^ oo. Therefore, we can obtain ^ since the sum of the middle three terms 
in (l25t is zero and cancels E{6af}E{6aLf}^ . ■ 

It should be noted that the above analysis is also valid for the asynchronous case when $P_e$ is sufficiently small.
Similar to the discussion in Section III.A, we can remove part of the chips in the first and the last symbol
periods to obtain a similar matrix S of dimension $(NM - d_{\max}) \times KL$, where $d_{\max}$ denotes the largest time offset of different
users, measured in chips. When $P_e$ is sufficiently small and M is sufficiently large, we can ignore the terms
scaled by $P_e^2$ and the edge effect in the first and last symbol periods. Then, we have

KL 

E{(S^(^Saa^(5S^S)^.} ^4Pe J] Yl s.sjs^a^a;, 
where $s_k$ is the k-th column of matrix S, which converges to $T_1$ as $M \to \infty$.